Overview

Dataset statistics

Number of variables: 19
Number of observations: 500
Missing cells: 2716
Missing cells (%): 28.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 74.3 KiB
Average record size in memory: 152.3 B
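
The missing-cell percentage follows from the other figures in this table: missing cells divided by the total number of cells (variables × observations). A minimal sketch of that arithmetic, using only the values reported here; the variable names are illustrative and not part of the profiling tool:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 19
n_observations = 500
missing_cells = 2716

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Rounded to one decimal place, this gives the 28.6% shown in the table.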

Variable types

Numeric: 11
Categorical: 7
Boolean: 1

Alerts

EFF_DT has a high cardinality: 482 distinct values [High cardinality]
BKG_DT has a high cardinality: 485 distinct values [High cardinality]
FNCT_CCY has a high cardinality: 144 distinct values [High cardinality]
TXN_CCY has a high cardinality: 143 distinct values [High cardinality]
SRC_SYS_ID has 93 (18.6%) missing values [Missing]
FRS_BU has 133 (26.6%) missing values [Missing]
FRS_AFFL_CD has 170 (34.0%) missing values [Missing]
ACTG_UNIT_ID has 212 (42.4%) missing values [Missing]
GOC has 288 (57.6%) missing values [Missing]
MNGD_SEG has 169 (33.8%) missing values [Missing]
BASE_CCY_AMT has 259 (51.8%) missing values [Missing]
FNCT_CCY_AMT has 137 (27.4%) missing values [Missing]
ENTRPS_PROD_CD has 136 (27.2%) missing values [Missing]
TXN_CCY_AMT has 50 (10.0%) missing values [Missing]
CITI_LV has 195 (39.0%) missing values [Missing]
ib_flag has 149 (29.8%) missing values [Missing]
segr_flag has 138 (27.6%) missing values [Missing]
FRS_ACCOUNT_CLASS has 163 (32.6%) missing values [Missing]
GAAP_TYP_CD has 132 (26.4%) missing values [Missing]
FNCT_CCY has 144 (28.8%) missing values [Missing]
TXN_CCY has 148 (29.6%) missing values [Missing]
EFF_DT is uniformly distributed [Uniform]
BKG_DT is uniformly distributed [Uniform]
FNCT_CCY is uniformly distributed [Uniform]
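
The per-variable missing counts in these alerts can be cross-checked against the "Missing cells" total in the dataset statistics. A small sketch, assuming only the counts listed in the alerts; the dictionary and variable names are illustrative:

```python
# Missing-value counts copied from the "Missing" alerts.
missing_by_variable = {
    "SRC_SYS_ID": 93, "FRS_BU": 133, "FRS_AFFL_CD": 170,
    "ACTG_UNIT_ID": 212, "GOC": 288, "MNGD_SEG": 169,
    "BASE_CCY_AMT": 259, "FNCT_CCY_AMT": 137, "ENTRPS_PROD_CD": 136,
    "TXN_CCY_AMT": 50, "CITI_LV": 195, "ib_flag": 149,
    "segr_flag": 138, "FRS_ACCOUNT_CLASS": 163, "GAAP_TYP_CD": 132,
    "FNCT_CCY": 144, "TXN_CCY": 148,
}
n_observations = 500

# Total number of missing cells across all variables.
total_missing = sum(missing_by_variable.values())

# Per-variable share of missing values, as shown in parentheses in each alert.
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_by_variable.items()
}

# Variables where more than half of the observations are missing.
mostly_missing = [name for name, pct in missing_pct.items() if pct > 50]

print(total_missing, mostly_missing)
```

The counts add up to the 2716 missing cells reported in the dataset statistics, and two variables (GOC and BASE_CCY_AMT) are missing in more than half of the rows.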

Reproduction

Analysis started: 2023-05-04 17:09:40.551470
Analysis finished: 2023-05-04 17:10:24.986073
Duration: 44.43 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json
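
The reported duration is the difference between the two timestamps. A short sketch of that check with the standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2023-05-04 17:09:40.551470")
finished = datetime.fromisoformat("2023-05-04 17:10:24.986073")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()

print(f"{duration_s:.2f} seconds")
```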

Variables

SRC_SYS_ID
Real number (ℝ)

Distinct: 407
Distinct (%): 100.0%
Missing: 93
Missing (%): 18.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 88536.511
Minimum: 30299
Maximum: 149907
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 30299
5-th percentile: 35981.5
Q1: 57822.5
Median: 88834
Q3: 116933
95-th percentile: 141242
Maximum: 149907
Range: 119608
Interquartile range (IQR): 59110.5
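
Range and interquartile range are derived from the other quantiles: Range = Maximum − Minimum and IQR = Q3 − Q1. The sketch below checks this against the values reported for SRC_SYS_ID and illustrates a linearly interpolated percentile on a tiny made-up sample. Linear interpolation is an assumption here (the report does not state which method was used), and the helper `percentile` is hypothetical:

```python
def percentile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)  # fractional index into the sorted data
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + weight * (ordered[upper] - ordered[lower])


# Values copied from the quantile table for SRC_SYS_ID.
minimum, q1, q3, maximum = 30299, 57822.5, 116933, 149907

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range

# Made-up sample, only to illustrate the interpolation step.
sample = [10, 20, 30, 40]
sample_q1 = percentile(sample, 0.25)

print(value_range, iqr, sample_q1)
```

Both derived values match the table (119608 and 59110.5).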

Descriptive statistics

Standard deviation: 34736.203
Coefficient of variation (CV): 0.39233761
Kurtosis: -1.2035208
Mean: 88536.511
Median Absolute Deviation (MAD): 29590
Skewness: 0.017185501
Sum: 36034360
Variance: 1.2066038 × 10^9
Monotonicity: Not monotonic
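
Several of these descriptive statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of non-missing values. A minimal sketch that checks the SRC_SYS_ID figures against those identities, within rounding; the names and the tolerance are illustrative choices:

```python
import math

# Values copied from the SRC_SYS_ID tables.
n_observations = 500
n_missing = 93
mean = 88536.511
std = 34736.203

n_valid = n_observations - n_missing  # non-missing values
cv = std / mean                       # coefficient of variation
variance = std ** 2                   # variance is the squared standard deviation
total = mean * n_valid                # sum = mean x number of valid values

# Compare with the reported figures, allowing for display rounding.
cv_ok = math.isclose(cv, 0.39233761, rel_tol=1e-6)
variance_ok = math.isclose(variance, 1.2066038e9, rel_tol=1e-6)
sum_ok = math.isclose(total, 36034360, rel_tol=1e-6)

print(n_valid, cv_ok, variance_ok, sum_ok)
```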
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
89377 1
 
0.2%
137820 1
 
0.2%
58241 1
 
0.2%
100403 1
 
0.2%
133132 1
 
0.2%
137737 1
 
0.2%
78707 1
 
0.2%
140196 1
 
0.2%
96161 1
 
0.2%
88006 1
 
0.2%
Other values (397) 397
79.4%
(Missing) 93
 
18.6%
ValueCountFrequency (%)
30299 1
0.2%
30470 1
0.2%
30882 1
0.2%
30981 1
0.2%
31219 1
0.2%
31237 1
0.2%
31523 1
0.2%
31584 1
0.2%
31783 1
0.2%
32104 1
0.2%
ValueCountFrequency (%)
149907 1
0.2%
149779 1
0.2%
149604 1
0.2%
149383 1
0.2%
147888 1
0.2%
147871 1
0.2%
147401 1
0.2%
147226 1
0.2%
146506 1
0.2%
146206 1
0.2%

FRS_BU
Real number (ℝ)

Distinct: 367
Distinct (%): 100.0%
Missing: 133
Missing (%): 26.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 31018.918
Minimum: 11038
Maximum: 49979
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 11038
5-th percentile: 12204.9
Q1: 21045
Median: 30548
Q3: 40637
95-th percentile: 48244.6
Maximum: 49979
Range: 38941
Interquartile range (IQR): 19592

Descriptive statistics

Standard deviation: 11460.234
Coefficient of variation (CV): 0.36945949
Kurtosis: -1.2103425
Mean: 31018.918
Median Absolute Deviation (MAD): 9829
Skewness: -0.036353792
Sum: 11383943
Variance: 1.3133696 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
37123 1
 
0.2%
35646 1
 
0.2%
44096 1
 
0.2%
11117 1
 
0.2%
48190 1
 
0.2%
11687 1
 
0.2%
47107 1
 
0.2%
12153 1
 
0.2%
30627 1
 
0.2%
44463 1
 
0.2%
Other values (357) 357
71.4%
(Missing) 133
 
26.6%
ValueCountFrequency (%)
11038 1
0.2%
11117 1
0.2%
11156 1
0.2%
11355 1
0.2%
11433 1
0.2%
11634 1
0.2%
11645 1
0.2%
11671 1
0.2%
11680 1
0.2%
11687 1
0.2%
ValueCountFrequency (%)
49979 1
0.2%
49566 1
0.2%
49485 1
0.2%
49436 1
0.2%
49435 1
0.2%
49163 1
0.2%
49008 1
0.2%
48868 1
0.2%
48741 1
0.2%
48723 1
0.2%

FRS_AFFL_CD
Real number (ℝ)

Distinct: 330
Distinct (%): 100.0%
Missing: 170
Missing (%): 34.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13014.6
Minimum: 11013
Maximum: 14989
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 11013
5-th percentile: 11291.95
Q1: 12041.25
Median: 12890.5
Q3: 14027.75
95-th percentile: 14758.55
Maximum: 14989
Range: 3976
Interquartile range (IQR): 1986.5

Descriptive statistics

Standard deviation: 1145.9344
Coefficient of variation (CV): 0.088049914
Kurtosis: -1.2239847
Mean: 13014.6
Median Absolute Deviation (MAD): 1015.5
Skewness: 0.046990188
Sum: 4294818
Variance: 1313165.7
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
14356 1
 
0.2%
11877 1
 
0.2%
12748 1
 
0.2%
14292 1
 
0.2%
14502 1
 
0.2%
14015 1
 
0.2%
14230 1
 
0.2%
14528 1
 
0.2%
12193 1
 
0.2%
13352 1
 
0.2%
Other values (320) 320
64.0%
(Missing) 170
34.0%
ValueCountFrequency (%)
11013 1
0.2%
11034 1
0.2%
11043 1
0.2%
11053 1
0.2%
11075 1
0.2%
11082 1
0.2%
11092 1
0.2%
11112 1
0.2%
11125 1
0.2%
11145 1
0.2%
ValueCountFrequency (%)
14989 1
0.2%
14979 1
0.2%
14961 1
0.2%
14950 1
0.2%
14947 1
0.2%
14939 1
0.2%
14935 1
0.2%
14906 1
0.2%
14895 1
0.2%
14888 1
0.2%

ACTG_UNIT_ID
Real number (ℝ)

Distinct: 288
Distinct (%): 100.0%
Missing: 212
Missing (%): 42.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2575.8507
Minimum: 41
Maximum: 4999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 41
5-th percentile: 242.5
Q1: 1367.25
Median: 2572.5
Q3: 3806
95-th percentile: 4788.5
Maximum: 4999
Range: 4958
Interquartile range (IQR): 2438.75

Descriptive statistics

Standard deviation: 1422.4635
Coefficient of variation (CV): 0.55223057
Kurtosis: -1.1000743
Mean: 2575.8507
Median Absolute Deviation (MAD): 1229.5
Skewness: -0.045873221
Sum: 741845
Variance: 2023402.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1971 1
 
0.2%
3149 1
 
0.2%
1906 1
 
0.2%
249 1
 
0.2%
3887 1
 
0.2%
321 1
 
0.2%
3504 1
 
0.2%
187 1
 
0.2%
1190 1
 
0.2%
4595 1
 
0.2%
Other values (278) 278
55.6%
(Missing) 212
42.4%
ValueCountFrequency (%)
41 1
0.2%
59 1
0.2%
71 1
0.2%
81 1
0.2%
85 1
0.2%
93 1
0.2%
94 1
0.2%
97 1
0.2%
155 1
0.2%
162 1
0.2%
ValueCountFrequency (%)
4999 1
0.2%
4998 1
0.2%
4978 1
0.2%
4970 1
0.2%
4945 1
0.2%
4940 1
0.2%
4912 1
0.2%
4897 1
0.2%
4886 1
0.2%
4868 1
0.2%

GOC
Real number (ℝ)

Distinct: 212
Distinct (%): 100.0%
Missing: 288
Missing (%): 57.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0235358 × 10^8
Minimum: 12480153
Maximum: 5.8692296 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 12480153
5-th percentile: 12481000
Q1: 12484806
Median: 2.3940403 × 10^8
Q3: 2.3941158 × 10^8
95-th percentile: 5.8692249 × 10^8
Maximum: 5.8692296 × 10^8
Range: 5.744428 × 10^8
Interquartile range (IQR): 2.2692677 × 10^8

Descriptive statistics

Standard deviation: 2.2067353 × 10^8
Coefficient of variation (CV): 1.0905343
Kurtosis: -0.79144914
Mean: 2.0235358 × 10^8
Median Absolute Deviation (MAD): 2.2691833 × 10^8
Skewness: 0.80483348
Sum: 4.289896 × 10^10
Variance: 4.8696807 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
239408378 1
 
0.2%
239404468 1
 
0.2%
239411299 1
 
0.2%
239406851 1
 
0.2%
239413136 1
 
0.2%
239409822 1
 
0.2%
239404106 1
 
0.2%
239409196 1
 
0.2%
239408285 1
 
0.2%
239407546 1
 
0.2%
Other values (202) 202
40.4%
(Missing) 288
57.6%
ValueCountFrequency (%)
12480153 1
0.2%
12480212 1
0.2%
12480296 1
0.2%
12480398 1
0.2%
12480411 1
0.2%
12480530 1
0.2%
12480555 1
0.2%
12480585 1
0.2%
12480654 1
0.2%
12480667 1
0.2%
ValueCountFrequency (%)
586922958 1
0.2%
586922913 1
0.2%
586922893 1
0.2%
586922777 1
0.2%
586922729 1
0.2%
586922643 1
0.2%
586922623 1
0.2%
586922586 1
0.2%
586922579 1
0.2%
586922559 1
0.2%

MNGD_SEG
Real number (ℝ)

Distinct: 331
Distinct (%): 100.0%
Missing: 169
Missing (%): 33.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 37443.166
Minimum: 3376
Maximum: 74921
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 3376
5-th percentile: 5591
Q1: 18208
Median: 35568
Q3: 56816.5
95-th percentile: 71595
Maximum: 74921
Range: 71545
Interquartile range (IQR): 38608.5

Descriptive statistics

Standard deviation: 21397.916
Coefficient of variation (CV): 0.57147722
Kurtosis: -1.2602316
Mean: 37443.166
Median Absolute Deviation (MAD): 19293
Skewness: 0.0760918
Sum: 12393688
Variance: 4.5787083 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
54926 1
 
0.2%
71451 1
 
0.2%
42477 1
 
0.2%
33567 1
 
0.2%
48742 1
 
0.2%
31352 1
 
0.2%
62363 1
 
0.2%
4314 1
 
0.2%
27573 1
 
0.2%
27188 1
 
0.2%
Other values (321) 321
64.2%
(Missing) 169
33.8%
ValueCountFrequency (%)
3376 1
0.2%
3470 1
0.2%
3597 1
0.2%
3699 1
0.2%
4205 1
0.2%
4314 1
0.2%
4754 1
0.2%
4766 1
0.2%
4847 1
0.2%
4853 1
0.2%
ValueCountFrequency (%)
74921 1
0.2%
74840 1
0.2%
74714 1
0.2%
74681 1
0.2%
74305 1
0.2%
74228 1
0.2%
73936 1
0.2%
73879 1
0.2%
73465 1
0.2%
73209 1
0.2%

BASE_CCY_AMT
Real number (ℝ)

Distinct: 241
Distinct (%): 100.0%
Missing: 259
Missing (%): 51.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 18987.895
Minimum: -288533.95
Maximum: 271052.1
Zeros: 0
Zeros (%): 0.0%
Negative: 97
Negative (%): 19.4%
Memory size: 4.0 KiB

Quantile statistics

Minimum: -288533.95
5-th percentile: -130837.1
Q1: -37570.492
Median: 17343.536
Q3: 79620.254
95-th percentile: 173503.06
Maximum: 271052.1
Range: 559586.06
Interquartile range (IQR): 117190.75

Descriptive statistics

Standard deviation: 93599.776
Coefficient of variation (CV): 4.9294445
Kurtosis: 0.40667477
Mean: 18987.895
Median Absolute Deviation (MAD): 58362.134
Skewness: -0.18049233
Sum: 4576082.7
Variance: 8.760918 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-44833.413 1
 
0.2%
2930.122 1
 
0.2%
31282.159 1
 
0.2%
134385.318 1
 
0.2%
118093.832 1
 
0.2%
-118718.714 1
 
0.2%
105061.914 1
 
0.2%
-20758.632 1
 
0.2%
-55830.93 1
 
0.2%
203307.055 1
 
0.2%
Other values (231) 231
46.2%
(Missing) 259
51.8%
ValueCountFrequency (%)
-288533.954 1
0.2%
-268590.032 1
0.2%
-244230.521 1
0.2%
-208253.208 1
0.2%
-192804.416 1
0.2%
-168063.322 1
0.2%
-156592.471 1
0.2%
-155096.867 1
0.2%
-152737.846 1
0.2%
-150431.197 1
0.2%
ValueCountFrequency (%)
271052.103 1
0.2%
240732.589 1
0.2%
221636.604 1
0.2%
219576.436 1
0.2%
210386.764 1
0.2%
205815.698 1
0.2%
203307.055 1
0.2%
195558.866 1
0.2%
191006.462 1
0.2%
180015.824 1
0.2%

FNCT_CCY_AMT
Real number (ℝ)

Distinct: 363
Distinct (%): 100.0%
Missing: 137
Missing (%): 27.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 460206.98
Minimum: -2434033.8
Maximum: 3160114.6
Zeros: 0
Zeros (%): 0.0%
Negative: 124
Negative (%): 24.8%
Memory size: 4.0 KiB

Quantile statistics

Minimum: -2434033.8
5-th percentile: -1138222.3
Q1: -239795.77
Median: 420493
Q3: 1150948.7
95-th percentile: 2180152.1
Maximum: 3160114.6
Range: 5594148.4
Interquartile range (IQR): 1390744.4

Descriptive statistics

Standard deviation: 995168.71
Coefficient of variation (CV): 2.1624372
Kurtosis: -0.23109228
Mean: 460206.98
Median Absolute Deviation (MAD): 684301.04
Skewness: 0.081152121
Sum: 1.6705514 × 10^8
Variance: 9.9036076 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-15523.959 1
 
0.2%
247430.22 1
 
0.2%
-1342161.295 1
 
0.2%
-562170.904 1
 
0.2%
-511066.453 1
 
0.2%
-239309.644 1
 
0.2%
358194.845 1
 
0.2%
158024.849 1
 
0.2%
1332562.018 1
 
0.2%
1042262.496 1
 
0.2%
Other values (353) 353
70.6%
(Missing) 137
 
27.4%
ValueCountFrequency (%)
-2434033.762 1
0.2%
-2140647.272 1
0.2%
-1916794.405 1
0.2%
-1726109.593 1
0.2%
-1647295.471 1
0.2%
-1600178.452 1
0.2%
-1578609.913 1
0.2%
-1475017.332 1
0.2%
-1460906.872 1
0.2%
-1443764.842 1
0.2%
ValueCountFrequency (%)
3160114.597 1
0.2%
2972165.912 1
0.2%
2836324.624 1
0.2%
2808438.736 1
0.2%
2711341.818 1
0.2%
2630543.318 1
0.2%
2558025.202 1
0.2%
2512439.158 1
0.2%
2438191.854 1
0.2%
2423255.847 1
0.2%

ENTRPS_PROD_CD
Real number (ℝ)

Distinct: 232
Distinct (%): 63.7%
Missing: 136
Missing (%): 27.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1307.9533
Minimum: 1102
Maximum: 1498
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB
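
The Distinct (%) figure for ENTRPS_PROD_CD is consistent with a denominator of non-missing values rather than all 500 observations. That reading is inferred from the numbers in this report, not taken from the tool's documentation; a small sketch of the comparison, with illustrative names:

```python
# Values copied from the ENTRPS_PROD_CD overview.
n_observations = 500
n_missing = 136
n_distinct = 232

n_valid = n_observations - n_missing  # rows that actually hold a value

# Distinct share relative to non-missing values (matches the report).
distinct_pct = 100 * n_distinct / n_valid

# Distinct share relative to all rows, shown only for contrast.
distinct_pct_all = 100 * n_distinct / n_observations

print(round(distinct_pct, 1), round(distinct_pct_all, 1))
```

Only the first figure rounds to the reported 63.7%.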

Quantile statistics

Minimum: 1102
5-th percentile: 1127.15
Q1: 1212
Median: 1312
Q3: 1404
95-th percentile: 1483
Maximum: 1498
Range: 396
Interquartile range (IQR): 192

Descriptive statistics

Standard deviation: 114.51029
Coefficient of variation (CV): 0.08754922
Kurtosis: -1.1631046
Mean: 1307.9533
Median Absolute Deviation (MAD): 95.5
Skewness: -0.077027785
Sum: 476095
Variance: 13112.607
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1242 5
 
1.0%
1312 4
 
0.8%
1110 4
 
0.8%
1380 4
 
0.8%
1489 4
 
0.8%
1493 4
 
0.8%
1369 4
 
0.8%
1362 4
 
0.8%
1308 3
 
0.6%
1370 3
 
0.6%
Other values (222) 325
65.0%
(Missing) 136
27.2%
ValueCountFrequency (%)
1102 1
 
0.2%
1103 2
0.4%
1105 1
 
0.2%
1106 3
0.6%
1108 1
 
0.2%
1109 1
 
0.2%
1110 4
0.8%
1111 1
 
0.2%
1113 1
 
0.2%
1118 1
 
0.2%
ValueCountFrequency (%)
1498 1
 
0.2%
1497 2
0.4%
1496 1
 
0.2%
1495 1
 
0.2%
1493 4
0.8%
1491 1
 
0.2%
1490 2
0.4%
1489 4
0.8%
1488 1
 
0.2%
1485 1
 
0.2%

TXN_CCY_AMT
Real number (ℝ)

Distinct: 450
Distinct (%): 100.0%
Missing: 50
Missing (%): 10.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 154316.31
Minimum: -531204.64
Maximum: 865135.14
Zeros: 0
Zeros (%): 0.0%
Negative: 125
Negative (%): 25.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: -531204.64
5-th percentile: -229415.38
Q1: -21573.834
Median: 154039.74
Q3: 324141.06
95-th percentile: 550578.85
Maximum: 865135.14
Range: 1396339.8
Interquartile range (IQR): 345714.89

Descriptive statistics

Standard deviation: 237797.34
Coefficient of variation (CV): 1.5409735
Kurtosis: -0.3227069
Mean: 154316.31
Median Absolute Deviation (MAD): 173874.61
Skewness: 0.025537128
Sum: 69442339
Variance: 5.6547575 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-34587.403 1
 
0.2%
536743.691 1
 
0.2%
403297.51 1
 
0.2%
205937.747 1
 
0.2%
173717.56 1
 
0.2%
430468.839 1
 
0.2%
-73795.275 1
 
0.2%
34673.308 1
 
0.2%
-34679.051 1
 
0.2%
553729.9 1
 
0.2%
Other values (440) 440
88.0%
(Missing) 50
 
10.0%
ValueCountFrequency (%)
-531204.639 1
0.2%
-418197.725 1
0.2%
-389183.76 1
0.2%
-365002.678 1
0.2%
-359197.623 1
0.2%
-353324.042 1
0.2%
-352708.853 1
0.2%
-345272.1 1
0.2%
-340836.678 1
0.2%
-340386.08 1
0.2%
ValueCountFrequency (%)
865135.144 1
0.2%
759775.446 1
0.2%
737590.109 1
0.2%
727082.396 1
0.2%
675521.854 1
0.2%
658324.876 1
0.2%
655198.127 1
0.2%
653214.353 1
0.2%
647934.005 1
0.2%
633579.62 1
0.2%

CITI_LV
Real number (ℝ)

Distinct: 299
Distinct (%): 98.0%
Missing: 195
Missing (%): 39.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15129.662
Minimum: 11011
Maximum: 18995
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 11011
5-th percentile: 11380
Q1: 12997
Median: 15256
Q3: 17245
95-th percentile: 18739.2
Maximum: 18995
Range: 7984
Interquartile range (IQR): 4248

Descriptive statistics

Standard deviation: 2385.4398
Coefficient of variation (CV): 0.15766643
Kurtosis: -1.231152
Mean: 15129.662
Median Absolute Deviation (MAD): 2086
Skewness: -0.085699951
Sum: 4614547
Variance: 5690322.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
13082 2
 
0.4%
11662 2
 
0.4%
18047 2
 
0.4%
18893 2
 
0.4%
17322 2
 
0.4%
16229 2
 
0.4%
18887 1
 
0.2%
15948 1
 
0.2%
16459 1
 
0.2%
16522 1
 
0.2%
Other values (289) 289
57.8%
(Missing) 195
39.0%
ValueCountFrequency (%)
11011 1
0.2%
11028 1
0.2%
11044 1
0.2%
11063 1
0.2%
11065 1
0.2%
11104 1
0.2%
11125 1
0.2%
11155 1
0.2%
11166 1
0.2%
11188 1
0.2%
ValueCountFrequency (%)
18995 1
0.2%
18897 1
0.2%
18894 1
0.2%
18893 2
0.4%
18887 1
0.2%
18884 1
0.2%
18872 1
0.2%
18861 1
0.2%
18795 1
0.2%
18794 1
0.2%

EFF_DT
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 482
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

12/03/2018: 2
03/04/2004: 2
02/10/2017: 2
10/09/2003: 2
04/13/2003: 2
Other values (477): 490

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 5000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 464
Unique (%): 92.8%

Sample

1st row: 01/14/2011
2nd row: 12/12/2007
3rd row: 10/15/2008
4th row: 09/12/1999
5th row: 08/05/2010
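
EFF_DT is profiled as Categorical because the dates are stored as text, which is also why it triggers the high-cardinality alert. A minimal sketch of parsing the sample rows into dates with the standard library; the MM/DD/YYYY format is an inference from values such as 04/13/2003 (13 cannot be a month), not something the report states:

```python
from datetime import datetime

# Sample rows copied from the EFF_DT "Sample" list.
sample = ["01/14/2011", "12/12/2007", "10/15/2008", "09/12/1999", "08/05/2010"]

# Assumed format: month/day/four-digit year.
parsed = [datetime.strptime(text, "%m/%d/%Y").date() for text in sample]

# Once parsed, the values can be ordered and compared as dates.
earliest = min(parsed)
latest = max(parsed)

print(earliest.isoformat(), latest.isoformat())
```

Once the column is parsed this way, it can be treated as a date variable rather than a set of string categories.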

Common Values

ValueCountFrequency (%)
12/03/2018 2
 
0.4%
03/04/2004 2
 
0.4%
02/10/2017 2
 
0.4%
10/09/2003 2
 
0.4%
04/13/2003 2
 
0.4%
01/26/2021 2
 
0.4%
07/17/2014 2
 
0.4%
08/12/2016 2
 
0.4%
09/14/2019 2
 
0.4%
01/12/1997 2
 
0.4%
Other values (472) 480
96.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
12/03/2018 2
 
0.4%
03/18/2017 2
 
0.4%
03/04/2004 2
 
0.4%
06/15/2015 2
 
0.4%
04/27/2001 2
 
0.4%
01/17/2010 2
 
0.4%
02/06/2019 2
 
0.4%
02/27/2013 2
 
0.4%
07/26/2018 2
 
0.4%
03/30/2008 2
 
0.4%
Other values (472) 480
96.0%

Most occurring characters

ValueCountFrequency (%)
0 1294
25.9%
/ 1000
20.0%
2 831
16.6%
1 785
15.7%
9 267
 
5.3%
7 157
 
3.1%
8 156
 
3.1%
3 152
 
3.0%
6 132
 
2.6%
5 124
 
2.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4000
80.0%
Other Punctuation 1000
 
20.0%
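
The category split above can be reproduced for a single value with Python's `unicodedata` module, which returns two-letter general-category codes ("Nd" for Decimal Number, "Po" for Other Punctuation). A short sketch on one sample value from EFF_DT; the variable names are illustrative:

```python
import unicodedata
from collections import Counter

# One sample value from EFF_DT; every value in the column has length 10.
value = "01/14/2011"

# General category code of each character in the string.
categories = Counter(unicodedata.category(ch) for ch in value)

print(dict(categories))
```

Each 10-character date contributes 8 digits and 2 slashes, which over 500 rows gives the 4000 and 1000 characters in the table.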

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1294
32.4%
2 831
20.8%
1 785
19.6%
9 267
 
6.7%
7 157
 
3.9%
8 156
 
3.9%
3 152
 
3.8%
6 132
 
3.3%
5 124
 
3.1%
4 102
 
2.5%
Other Punctuation
ValueCountFrequency (%)
/ 1000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1294
25.9%
/ 1000
20.0%
2 831
16.6%
1 785
15.7%
9 267
 
5.3%
7 157
 
3.1%
8 156
 
3.1%
3 152
 
3.0%
6 132
 
2.6%
5 124
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1294
25.9%
/ 1000
20.0%
2 831
16.6%
1 785
15.7%
9 267
 
5.3%
7 157
 
3.1%
8 156
 
3.1%
3 152
 
3.0%
6 132
 
2.6%
5 124
 
2.5%

BKG_DT
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 485
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

03/10/2011: 2
05/27/2005: 2
01/22/1999: 2
01/08/2003: 2
05/25/2010: 2
Other values (480): 490

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 5000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 470
Unique (%): 94.0%

Sample

1st row: 03/10/2011
2nd row: 11/12/2007
3rd row: 10/05/2008
4th row: 09/09/1999
5th row: 08/11/2010

Common Values

ValueCountFrequency (%)
03/10/2011 2
 
0.4%
05/27/2005 2
 
0.4%
01/22/1999 2
 
0.4%
01/08/2003 2
 
0.4%
05/25/2010 2
 
0.4%
03/23/2015 2
 
0.4%
03/04/2017 2
 
0.4%
12/05/2014 2
 
0.4%
09/18/2008 2
 
0.4%
01/14/2011 2
 
0.4%
Other values (475) 480
96.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
03/10/2011 2
 
0.4%
09/18/2008 2
 
0.4%
05/27/2005 2
 
0.4%
09/14/2007 2
 
0.4%
08/01/2019 2
 
0.4%
01/12/2017 2
 
0.4%
12/18/1997 2
 
0.4%
01/14/2011 2
 
0.4%
08/14/2020 2
 
0.4%
12/05/2014 2
 
0.4%
Other values (475) 480
96.0%

Most occurring characters

ValueCountFrequency (%)
0 1300
26.0%
/ 1000
20.0%
2 844
16.9%
1 772
15.4%
9 260
 
5.2%
8 152
 
3.0%
3 149
 
3.0%
7 149
 
3.0%
5 135
 
2.7%
4 122
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4000
80.0%
Other Punctuation 1000
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1300
32.5%
2 844
21.1%
1 772
19.3%
9 260
 
6.5%
8 152
 
3.8%
3 149
 
3.7%
7 149
 
3.7%
5 135
 
3.4%
4 122
 
3.0%
6 117
 
2.9%
Other Punctuation
ValueCountFrequency (%)
/ 1000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1300
26.0%
/ 1000
20.0%
2 844
16.9%
1 772
15.4%
9 260
 
5.2%
8 152
 
3.0%
3 149
 
3.0%
7 149
 
3.0%
5 135
 
2.7%
4 122
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1300
26.0%
/ 1000
20.0%
2 844
16.9%
1 772
15.4%
9 260
 
5.2%
8 152
 
3.0%
3 149
 
3.0%
7 149
 
3.0%
5 135
 
2.7%
4 122
 
2.4%

ib_flag
Boolean

Distinct: 2
Distinct (%): 0.6%
Missing: 149
Missing (%): 29.8%
Memory size: 1.1 KiB

Value      Count  Frequency (%)
False      176    35.2%
True       175    35.0%
(Missing)  149    29.8%

segr_flag
Categorical

Distinct: 3
Distinct (%): 0.8%
Missing: 138
Missing (%): 27.6%
Memory size: 4.0 KiB

Y: 190
N: 169
(whitespace only): 3

Length

Max length: 2
Median length: 1
Mean length: 1.0082873
Min length: 1

Characters and Unicode

Total characters: 365
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Y
2nd row: (blank)
3rd row: N
4th row: N
5th row: (blank)

Common Values

Value              Count  Frequency (%)
Y                  190    38.0%
N                  169    33.8%
(whitespace only)  3      0.6%
(Missing)          138    27.6%
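
Three segr_flag entries contain only whitespace, so they count as a third distinct value instead of being treated as missing. The sketch below rebuilds the column from the counts above and normalises those entries. Taking each whitespace entry to be two spaces follows from the maximum length of 2 and the 6 space characters reported for this variable; the names are illustrative:

```python
from collections import Counter

# Rebuild the column from the counts in the "Common Values" table.
# None stands for a missing value; "  " is a whitespace-only entry.
values = ["Y"] * 190 + ["N"] * 169 + ["  "] * 3 + [None] * 138

# Mean string length over non-missing values, as in the "Length" section.
present = [v for v in values if v is not None]
mean_length = sum(len(v) for v in present) / len(present)

# Treat whitespace-only strings as missing.
cleaned = [v if v is not None and v.strip() else None for v in values]
counts = Counter(cleaned)

print(round(mean_length, 7), counts[None])
```

After cleaning, only Y and N remain as categories and the missing count rises from 138 to 141.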

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
y 190
52.9%
n 169
47.1%

Most occurring characters

Value    Count  Frequency (%)
Y        190    52.1%
N        169    46.3%
(space)  6      1.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 359
98.4%
Space Separator 6
 
1.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 190
52.9%
N 169
47.1%
Space Separator
Value    Count  Frequency (%)
(space)  6      100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 359
98.4%
Common 6
 
1.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 190
52.9%
N 169
47.1%
Common
Value    Count  Frequency (%)
(space)  6      100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 365
100.0%

Most frequent character per block

ASCII
Value    Count  Frequency (%)
Y        190    52.1%
N        169    46.3%
(space)  6      1.6%

FRS_ACCOUNT_CLASS
Categorical

Distinct: 2
Distinct (%): 0.6%
Missing: 163
Missing (%): 32.6%
Memory size: 4.0 KiB

CASH: 172
OVDFT-LIAB: 165

Length

Max length: 10
Median length: 4
Mean length: 6.9376855
Min length: 4

Characters and Unicode

Total characters: 2338
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: OVDFT-LIAB
2nd row: CASH
3rd row: OVDFT-LIAB
4th row: CASH
5th row: OVDFT-LIAB

Common Values

ValueCountFrequency (%)
CASH 172
34.4%
OVDFT-LIAB 165
33.0%
(Missing) 163
32.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
cash 172
51.0%
ovdft-liab 165
49.0%

Most occurring characters

ValueCountFrequency (%)
A 337
14.4%
C 172
 
7.4%
S 172
 
7.4%
H 172
 
7.4%
O 165
 
7.1%
V 165
 
7.1%
D 165
 
7.1%
F 165
 
7.1%
T 165
 
7.1%
- 165
 
7.1%
Other values (3) 495
21.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2173
92.9%
Dash Punctuation 165
 
7.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 337
15.5%
C 172
7.9%
S 172
7.9%
H 172
7.9%
O 165
7.6%
V 165
7.6%
D 165
7.6%
F 165
7.6%
T 165
7.6%
L 165
7.6%
Other values (2) 330
15.2%
Dash Punctuation
ValueCountFrequency (%)
- 165
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2173
92.9%
Common 165
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 337
15.5%
C 172
7.9%
S 172
7.9%
H 172
7.9%
O 165
7.6%
V 165
7.6%
D 165
7.6%
F 165
7.6%
T 165
7.6%
L 165
7.6%
Other values (2) 330
15.2%
Common
ValueCountFrequency (%)
- 165
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2338
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 337
14.4%
C 172
 
7.4%
S 172
 
7.4%
H 172
 
7.4%
O 165
 
7.1%
V 165
 
7.1%
D 165
 
7.1%
F 165
 
7.1%
T 165
 
7.1%
- 165
 
7.1%
Other values (3) 495
21.2%

GAAP_TYP_CD
Categorical

Distinct: 4
Distinct (%): 1.1%
Missing: 132
Missing (%): 26.4%
Memory size: 4.0 KiB

US_GAAP: 149
LCL_GAAP: 78
COM_GAAP: 75
I_GAAP: 66

Length

Max length: 8
Median length: 7
Mean length: 7.236413
Min length: 6

Characters and Unicode

Total characters: 2663
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: US_GAAP
2nd row: US_GAAP
3rd row: US_GAAP
4th row: US_GAAP
5th row: COM_GAAP

Common Values

ValueCountFrequency (%)
US_GAAP 149
29.8%
LCL_GAAP 78
15.6%
COM_GAAP 75
15.0%
I_GAAP 66
13.2%
(Missing) 132
26.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
us_gaap 149
40.5%
lcl_gaap 78
21.2%
com_gaap 75
20.4%
i_gaap 66
17.9%

Most occurring characters

ValueCountFrequency (%)
A 736
27.6%
_ 368
13.8%
G 368
13.8%
P 368
13.8%
L 156
 
5.9%
C 153
 
5.7%
U 149
 
5.6%
S 149
 
5.6%
O 75
 
2.8%
M 75
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2295
86.2%
Connector Punctuation 368
 
13.8%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 736
32.1%
G 368
16.0%
P 368
16.0%
L 156
 
6.8%
C 153
 
6.7%
U 149
 
6.5%
S 149
 
6.5%
O 75
 
3.3%
M 75
 
3.3%
I 66
 
2.9%
Connector Punctuation
ValueCountFrequency (%)
_ 368
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2295
86.2%
Common 368
 
13.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 736
32.1%
G 368
16.0%
P 368
16.0%
L 156
 
6.8%
C 153
 
6.7%
U 149
 
6.5%
S 149
 
6.5%
O 75
 
3.3%
M 75
 
3.3%
I 66
 
2.9%
Common
ValueCountFrequency (%)
_ 368
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2663
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 736
27.6%
_ 368
13.8%
G 368
13.8%
P 368
13.8%
L 156
 
5.9%
C 153
 
5.7%
U 149
 
5.6%
S 149
 
5.6%
O 75
 
2.8%
M 75
 
2.8%

FNCT_CCY
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 144
Distinct (%): 40.4%
Missing: 144
Missing (%): 28.8%
Memory size: 4.0 KiB

ZAR: 6
IMP: 6
SAR: 6
ZWD: 6
TZS: 5
Other values (139): 327

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1068
Distinct characters: 25
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): 10.1%

Sample

1st row: GHS
2nd row: BND
3rd row: BND
4th row: BBD
5th row: MZN

Common Values

ValueCountFrequency (%)
ZAR 6
 
1.2%
IMP 6
 
1.2%
SAR 6
 
1.2%
ZWD 6
 
1.2%
TZS 5
 
1.0%
SZL 5
 
1.0%
YER 5
 
1.0%
BAM 5
 
1.0%
COP 5
 
1.0%
BBD 5
 
1.0%
Other values (134) 302
60.4%
(Missing) 144
28.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
zar 6
 
1.7%
sar 6
 
1.7%
zwd 6
 
1.7%
imp 6
 
1.7%
bam 5
 
1.4%
bsd 5
 
1.4%
cop 5
 
1.4%
bbd 5
 
1.4%
yer 5
 
1.4%
szl 5
 
1.4%
Other values (134) 302
84.8%

Most occurring characters

ValueCountFrequency (%)
D 96
 
9.0%
S 76
 
7.1%
R 75
 
7.0%
P 71
 
6.6%
K 66
 
6.2%
B 62
 
5.8%
L 59
 
5.5%
N 53
 
5.0%
A 53
 
5.0%
M 51
 
4.8%
Other values (15) 406
38.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1068
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
D 96
 
9.0%
S 76
 
7.1%
R 75
 
7.0%
P 71
 
6.6%
K 66
 
6.2%
B 62
 
5.8%
L 59
 
5.5%
N 53
 
5.0%
A 53
 
5.0%
M 51
 
4.8%
Other values (15) 406
38.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1068
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
D 96
 
9.0%
S 76
 
7.1%
R 75
 
7.0%
P 71
 
6.6%
K 66
 
6.2%
B 62
 
5.8%
L 59
 
5.5%
N 53
 
5.0%
A 53
 
5.0%
M 51
 
4.8%
Other values (15) 406
38.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1068
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
D 96
 
9.0%
S 76
 
7.1%
R 75
 
7.0%
P 71
 
6.6%
K 66
 
6.2%
B 62
 
5.8%
L 59
 
5.5%
N 53
 
5.0%
A 53
 
5.0%
M 51
 
4.8%
Other values (15) 406
38.0%

TXN_CCY
Categorical

HIGH CARDINALITY
MISSING

Distinct: 143
Distinct (%): 40.6%
Missing: 148
Missing (%): 29.6%
Memory size: 4.0 KiB

LTL: 7
MAD: 6
ALL: 6
JMD: 6
PHP: 6
Other values (138): 321

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1056
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 9.9%

Sample

1st row: SVC
2nd row: BRL
3rd row: TVD
4th row: BDT
5th row: JMD

Common Values

ValueCountFrequency (%)
LTL 7
 
1.4%
MAD 6
 
1.2%
ALL 6
 
1.2%
JMD 6
 
1.2%
PHP 6
 
1.2%
FJD 5
 
1.0%
EUR 5
 
1.0%
SOS 5
 
1.0%
BTN 5
 
1.0%
TMT 5
 
1.0%
Other values (133) 296
59.2%
(Missing) 148
29.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ltl 7
 
2.0%
all 6
 
1.7%
jmd 6
 
1.7%
php 6
 
1.7%
mad 6
 
1.7%
tmt 5
 
1.4%
mdl 5
 
1.4%
btn 5
 
1.4%
sos 5
 
1.4%
eur 5
 
1.4%
Other values (133) 296
84.1%

Most occurring characters

Value Count Frequency (%)
D 117 11.1%
S 75 7.1%
R 68 6.4%
L 67 6.3%
M 56 5.3%
N 56 5.3%
P 54 5.1%
T 52 4.9%
G 52 4.9%
A 51 4.8%
Other values (16) 408 38.6%

Most occurring categories

Value Count Frequency (%)
Uppercase Letter 1056 100.0%

Most frequent character per category

Uppercase Letter
Value Count Frequency (%)
D 117 11.1%
S 75 7.1%
R 68 6.4%
L 67 6.3%
M 56 5.3%
N 56 5.3%
P 54 5.1%
T 52 4.9%
G 52 4.9%
A 51 4.8%
Other values (16) 408 38.6%

Most occurring scripts

Value Count Frequency (%)
Latin 1056 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
D 117 11.1%
S 75 7.1%
R 68 6.4%
L 67 6.3%
M 56 5.3%
N 56 5.3%
P 54 5.1%
T 52 4.9%
G 52 4.9%
A 51 4.8%
Other values (16) 408 38.6%

Most occurring blocks

Value Count Frequency (%)
ASCII 1056 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
D 117 11.1%
S 75 7.1%
R 68 6.4%
L 67 6.3%
M 56 5.3%
N 56 5.3%
P 54 5.1%
T 52 4.9%
G 52 4.9%
A 51 4.8%
Other values (16) 408 38.6%

Interactions

[Figures: 121 pairwise interaction plots, rendered with Matplotlib v3.5.2]

Correlations


Auto

The auto setting chooses an interpretable pairwise metric according to the types of the two columns, using the following mapping:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramer's V, [0,1]
  • Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.

This configuration uses the recommended metric for each pair of columns.
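To illustrate what discretizing a numerical column means for the Numerical-Categorical case, here is a minimal sketch that assigns each value to one of n_bins equal-width bins. The function name `discretize` is hypothetical, and equal-width binning is an assumption: the text above does not say which binning strategy the report actually uses.

```python
def discretize(values, n_bins):
    """Map each number to a bin index in 0..n_bins-1 using equal-width bins.

    Equal-width binning is an assumption made for illustration only.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    labels = []
    for v in values:
        # A constant column has zero width; put everything in bin 0.
        idx = int((v - lo) / width) if width > 0 else 0
        # The maximum value would land in bin n_bins, so clamp it into the last bin.
        labels.append(min(idx, n_bins - 1))
    return labels


amounts = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
coarse = discretize(amounts, 2)   # two wide bins
fine = discretize(amounts, 5)     # five narrower bins
```

The resulting bin labels can then be treated as a categorical column. Fewer bins give a coarser view of the association, more bins a finer one, at the cost of fewer observations per bin.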

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic relationships than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
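The calculation described above can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. The helper names are hypothetical, and giving tied values the average of their positions is an assumption about tie handling.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def spearman_rho(x, y):
    """Covariance of the ranks divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    return covariance(rx, ry) / (covariance(rx, rx) ** 0.5 * covariance(ry, ry) ** 0.5)


x = [1, 2, 3, 4, 5]
cubes = [v ** 3 for v in x]          # nonlinear but strictly increasing
rho_up = spearman_rho(x, cubes)      # close to +1
rho_down = spearman_rho(x, cubes[::-1])  # close to -1
```

Because only the ranks enter the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.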

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
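A minimal sketch of this calculation, using only the standard library, is shown below. The function name and the sample values are made up for illustration. The second half checks the invariance property: shifting and rescaling either variable (by a positive factor) leaves r unchanged.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors in the covariance and the variances cancel, so sums suffice.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]        # illustrative values, roughly linear in x

r = pearson_r(x, y)

# Change location and scale of both variables (positive scale factors).
x_shifted = [10 * a + 7 for a in x]
y_shifted = [0.5 * b - 3 for b in y]
r_shifted = pearson_r(x_shifted, y_shifted)

# A monotonic but nonlinear relationship gives r below 1.
r_cubic = pearson_r(x, [a ** 3 for a in x])
```

Here `r` and `r_shifted` agree up to floating-point error, while `r_cubic` stays below 1 even though the relationship is perfectly monotonic.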

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
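The pair-counting definition above corresponds to the variant usually called τ-a, sketched below with a hypothetical function name. Statistical libraries often compute τ-b instead, which adjusts the denominator for ties, so results can differ when the data contain tied values.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in x or y count as neither concordant nor discordant.
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]              # one swapped pair among six
tau = kendall_tau_a(x, y)     # (5 - 1) / 6
```

With four observations there are six pairs. Only the pair at positions 2 and 3 is discordant, which gives τ = 4/6.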

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
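As a rough illustration, the sketch below computes a bias-corrected Cramér's V following the correction as it is commonly stated: subtract (k-1)(r-1)/(n-1) from φ² and shrink the row and column counts before normalizing. The function name is hypothetical, the inputs are assumed to contain no missing values, and this is not necessarily the exact code path the report uses.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long categorical sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for row, n_row in row_totals.items():
        for col, n_col in col_totals.items():
            expected = n_row * n_col / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    # A column with a single category carries no association information.
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Made-up flags: the first pair is perfectly associated, the second independent.
v_perfect = cramers_v_corrected(["Y", "N"] * 10, ["CASH", "OVDFT"] * 10)
v_none = cramers_v_corrected(["Y", "Y", "N", "N"] * 5, ["A", "B", "A", "B"] * 5)
```

The correction pulls small spurious associations toward 0, which matters for high-cardinality columns with few observations per category.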

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
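One common way to obtain a nullity correlation is to replace each column by a 0/1 indicator of whether the value is missing and take the Pearson correlation of those indicators. The sketch below does this for two columns in plain Python. The function name is hypothetical, missing values are represented as `None`, and whether the report computes its heatmap exactly this way is an assumption.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is never missing or always missing.
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns, not taken from the dataset.
left = [1, None, 3, None, 5, None]
same_pattern = [7, None, 9, None, 11, None]    # missing in the same rows
opposite = [None, 2, None, 4, None, 6]         # missing exactly where left is present
complete = [1, 2, 3, 4, 5, 6]                  # never missing

together = nullity_correlation(left, same_pattern)
apart = nullity_correlation(left, opposite)
undefined = nullity_correlation(left, complete)
```

A value near +1 means the two columns tend to be missing in the same rows, a value near -1 means one is present where the other is missing, and columns with no variation in missingness have no defined correlation.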

Sample

SRC_SYS_IDFRS_BUFRS_AFFL_CDACTG_UNIT_IDGOCMNGD_SEGBASE_CCY_AMTFNCT_CCY_AMTENTRPS_PROD_CDTXN_CCY_AMTCITI_LVEFF_DTBKG_DTib_flagsegr_flagFRS_ACCOUNT_CLASSGAAP_TYP_CDFNCT_CCYTXN_CCY
097268.037123.0NaNNaN12486250.054926.0NaNNaN1448.0-34587.403NaN01/14/201103/10/2011NYOVDFT-LIABUS_GAAPGHSSVC
1NaN34759.0NaNNaNNaN61614.0NaNNaN1236.0159254.20411927.012/12/200711/12/2007YCASHNaNBNDBRL
2NaN32522.0NaN944.0NaN43359.0NaNNaNNaN336960.326NaN10/15/200810/05/2008YNOVDFT-LIABUS_GAAPBNDNaN
3142220.0NaNNaN186.012484279.0NaNNaN-1175084.5441408.0120884.520NaN09/12/199909/09/1999YNNaNUS_GAAPBBDNaN
434162.038463.011483.01958.0NaN72428.0-138661.589NaN1350.0-127826.876NaN08/05/201008/11/2010NCASHUS_GAAPMZNTVD
547786.0NaNNaN3752.012482521.034306.0NaN-1016383.6641172.0290891.54912450.011/05/202012/15/2020NYOVDFT-LIABCOM_GAAPNaNBDT
6NaN44086.014003.04864.0NaN23853.0NaNNaNNaN398256.869NaN01/23/201701/12/2017NaNNNaNCOM_GAAPSYPJMD
7NaN42688.011792.041.0NaN34018.0NaN1224781.1031333.0238877.82912792.002/06/201902/18/2019NNCASHCOM_GAAPJMDKHR
873774.0NaN14275.03182.0NaNNaN-29812.952-856057.738NaN366843.369NaN01/28/200203/09/2002NaNNaNNaNLBPPHP
958469.029995.0NaNNaNNaN53440.0-87900.843NaN1214.0-29383.891NaN06/18/201908/01/2019YNaNNaNI_GAAPILSUZS
SRC_SYS_IDFRS_BUFRS_AFFL_CDACTG_UNIT_IDGOCMNGD_SEGBASE_CCY_AMTFNCT_CCY_AMTENTRPS_PROD_CDTXN_CCY_AMTCITI_LVEFF_DTBKG_DTib_flagsegr_flagFRS_ACCOUNT_CLASSGAAP_TYP_CDFNCT_CCYTXN_CCY
490NaN15043.014353.0619.0NaN5941.0NaN489960.5651395.0278881.66716165.001/18/201301/12/2013NaNYNaNLCL_GAAPUAHBOB
491NaN19916.0NaN2287.0586922586.028092.0NaNNaNNaN445452.05611964.008/16/199909/29/1999NYCASHI_GAAPNPRNaN
492NaN12070.0NaNNaNNaN23375.0NaNNaNNaN-352708.85315604.008/15/202010/02/2020NYCASHCOM_GAAPNaNNaN
49336308.027451.014895.04270.0586921092.025708.0-26955.582-449970.4561370.027380.71312376.005/30/200707/24/2007YNCASHNaNNaNKMF
494NaN17191.014761.0NaNNaN13227.0NaNNaN1190.0NaN14659.004/12/200505/27/2005NNOVDFT-LIABNaNNaNNaN
49561654.019601.0NaNNaNNaNNaNNaNNaNNaN31436.407NaN10/04/201111/03/2011NaNNCASHCOM_GAAPCLPXPF
496114495.0NaNNaNNaNNaNNaNNaN133706.304NaN-232272.95911396.008/18/202209/20/2022YYOVDFT-LIABNaNMGAOMR
497137934.039219.014734.02723.0586921420.025509.0106852.742-372890.5651135.0409500.14813527.006/30/199806/23/1998YYCASHCOM_GAAPFJDRSD
49879237.048723.012709.04681.0586921822.029707.0-208253.208-689971.5641436.0508128.69818261.010/08/201412/05/2014NaNNaNCASHI_GAAPKESNaN
49990611.048524.011512.01683.0586922913.014890.016183.745105384.8181434.0-9386.86912682.007/08/199907/16/1999YNaNNaNLCL_GAAPZMWPEN